Me and my friends: gene mention normalization with background knowledge

Authors

  • Jörg Hakenberg
  • Loic Royer
  • Conrad Plake
  • Hendrik Strobelt
  • Michael Schroeder
Abstract

“Tell me who your friends are, and I will tell you who you are”: this proverb best illustrates our approach to the normalization of gene names. We rely on background knowledge that describes various aspects of a gene: it is localized on a chromosomal band, it belongs to an operon structure, it is a member of a gene family, its products take part in biological processes, they fulfil molecular functions, they occur at dedicated cellular locations, mutations of the gene ultimately cause diseases, and its proteins contain domains and form secondary, tertiary, and quaternary structures. Whenever a gene (or one of its products) is discussed, some of these aspects (the gene's friends, that is, semantically related information) will be mentioned as well. The paradigm we follow therefore demands not only the presence of a gene's name, but also the presence of some of its friends. We treat all the information available for a gene (see Methods section) as that gene's description, or the context the gene typically “lives” in. Whenever the identity of a gene is ambiguous, we assess each candidate gene by comparing its typical context against the context at hand (in this case, a PubMed abstract). The descriptions of genes originate from various curated resources: EntrezGene provides organisms, summaries, chromosomal loci, Gene Ontology (GO) terms, and encoded proteins; UniProt provides functional descriptions, protein domains, interaction partners, keywords, and GO terms; GOA provides further GO annotations.

Consider the example of the oncogene p54 (see Figure 1). Even after the organism has been resolved, several human genes in EntrezGene still share this name. These entries refer to completely different genes with disjoint annotations, so the ambiguity cannot be resolved from the name alone. Only a comparison of each candidate's context with the text reveals that one of the candidates is an RNA helicase, and the text indeed mentions “RNA helicase.” The text also mentions the exact chromosomal location of the correct gene.
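
The comparison of candidate contexts against a text can be illustrated with a minimal sketch. This is not the authors' implementation: the scoring rule (counting annotation phrases that occur verbatim in the text), the function names, the candidate identifiers, the annotation phrases, and the sample sentence are all assumptions made up for illustration; only the idea of requiring some of a gene's "friends" in the text comes from the abstract.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def contains_phrase(text_tokens, phrase_tokens):
    """Return True if phrase_tokens occurs as a contiguous run in text_tokens."""
    n = len(phrase_tokens)
    if n == 0:
        return False
    return any(
        text_tokens[i:i + n] == phrase_tokens
        for i in range(len(text_tokens) - n + 1)
    )


def score_candidate(annotations, text_tokens):
    """Count how many annotation phrases ("friends") appear in the text."""
    return sum(contains_phrase(text_tokens, tokenize(a)) for a in annotations)


def disambiguate(candidates, text, min_friends=1):
    """Pick the candidate whose annotations best match the text.

    Returns (best_id_or_None, scores). A candidate is accepted only if at
    least `min_friends` of its annotation phrases occur in the text. Ties
    are broken by dictionary order (first candidate wins); a real system
    would need a better policy.
    """
    text_tokens = tokenize(text)
    scores = {
        gene_id: score_candidate(annotations, text_tokens)
        for gene_id, annotations in candidates.items()
    }
    best = max(scores, key=scores.get)
    return (best if scores[best] >= min_friends else None), scores


# Hypothetical candidates sharing one name; identifiers and annotations
# are invented for this example and are not taken from EntrezGene.
candidates = {
    "CANDIDATE_A": ["RNA helicase", "ATP binding", "11q12"],
    "CANDIDATE_B": ["splicing factor", "DNA binding", "1p34"],
}

# Hypothetical sentence standing in for a PubMed abstract.
text = "The oncogene p54 encodes a putative RNA helicase and maps to band 11q12."

best, scores = disambiguate(candidates, text)
print(best, scores)
```

In this toy setup the name alone cannot separate the two candidates; only the annotation phrases found in the sentence do. A text that mentions the name but none of the annotations yields no assignment, which mirrors the requirement that some friends be present.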

Publication date: 2007